Using two retrotransposon-based marker systems (SRAP and REMAP) for genetic diversity analysis of Moroccan Argan tree

The argan tree (Argania spinosa L.) is an endemic genetic resource of Morocco with important ecological and socio-economic benefits. However, overgrazing and overharvesting have led to a serious decline in the number of trees. To characterize genetic diversity within and among 24 populations, represented by 240 argan trees, four combinations of SRAP primers and eight combinations of REMAP primers were used. A total of 338 REMAP and 146 SRAP markers were amplified with a polymorphism of 100%. The average polymorphism information content value was 0.20 and 0.17 for SRAP and REMAP markers, respectively. The analysis of molecular variance showed that 26% of the genetic variation was partitioned among populations. The coefficient of gene differentiation was 0.2875 and gene flow was 1.2391. The average diversity parameters were: observed number of alleles (Na) = 0.729, effective number of alleles (Ne) = 1.131, Shannon's information index (I) = 1.143, Nei's gene diversity (H) = 0.093 and percentage of polymorphic loci = 35.68. STRUCTURE and principal coordinate analysis revealed that the Argania spinosa L. populations were aggregated into two genetic groups. BayeScan software was used to detect outlier loci, and 21 were detected (7 under selection, 14 under balancing selection) with posterior probabilities higher than 0.79. The current results can be used in the design of management programs and to understand the adaptation mechanisms of the argan tree.


INTRODUCTION
The argan tree (Argania spinosa L.) is the only representative of the Sapotaceae family in Morocco. It is an endemic species of the country, where it ranks second after the holm oak. It is widely distributed in the southwestern region of Morocco, covering an area of more than 900,000 ha [1]. Its main interest lies in its fruit, which yields a very valuable oil for therapeutic, cosmetic and food uses [2,3]. The argan tree also plays a remarkable and irreplaceable role in the ecological balance: its powerful root system protects against water and wind erosion and contributes to the maintenance of the soil. However, due to repeated droughts, excessive exploitation and insufficient natural regeneration, the argan forest has undergone alarming deterioration in less than a century. More than half of it has disappeared, and its average density has decreased from 100 to 30 trees/ha and even to 10 trees/ha [4]. In the face of this accelerated degradation, it is extremely important to develop adequate and effective strategies to conserve, restore and enhance this genetic resource. The study of genetic variability is therefore a crucial step in understanding the state and structure of the available diversity, and it provides valuable information for exploiting genetic diversity within an effective conservation program.
In this investigation, we chose to study the genetic diversity of natural populations of the argan tree with two molecular markers, sequence-related amplified polymorphism (SRAP) and retrotransposon-microsatellite amplified polymorphism (REMAP), based on the transposable elements, since they constitute a major component of plant genomes [5,7]. SRAP is an appropriate molecular marker system for the analysis of plant genetic diversity, and it has many advantages such as simplicity, reliability, flexibility, detection of multiple loci and cost-effectiveness [8]. REMAP exploits dispersed and abundant repeated sequences such as retrotransposon LTRs. Combining these sequences makes it possible to amplify a series of bands (DNA fingerprints) using primers homologous to these highly repeated copies. The markers generated are highly informative [9]. The effectiveness and credibility of these two approaches in the analysis of genetic variability have been demonstrated in many plant species, especially woody species [10,13]. They should enable an accurate assessment of genetic diversity, give a better understanding of population structure and contribute to the use and effective conservation of the argan tree. Previous genetic diversity studies of the argan tree used molecular markers including RFLP of chloroplast DNA [14], SSR [15], ISSR [16], AFLP [17] and IRAP [18]. The present work is original in that it is the first application of the SRAP and REMAP marker systems to the argan tree.

MATERIALS AND METHODS
Collection of plant materials and genomic DNA extraction: A total of 240 genotypes were selected from 24 locations across Morocco for the REMAP and SRAP analysis (Table 1). Fresh leaves were collected from adult trees, cleaned with moist paper towels and preserved at -80°C. Genomic DNA was extracted from 50 mg of dried leaf tissue following the protocol of the ISOLATE II Plant DNA Kit (Bioline, USA). The quality and quantity of the DNA were determined by spectrophotometry using a NanoDrop 2000 (NanoDrop Technologies Inc., USA) and by visual assessment on a 1% agarose gel.

Molecular analysis of SRAP:
The SRAP analysis was based on the protocol published by Keify and Beiki [19]. The 25 µl reaction mixture contained 20 ng of template DNA, 1× Taq buffer, 2.5 mM MgCl2, 0.2 mM dNTP, 0.25 µM primer and 1 U Taq DNA polymerase (Promega). Amplification was carried out in a Veriti 96-Well Thermal Cycler. The PCR program consisted of an initial denaturation at 94°C for 3 min, followed by 5 cycles of denaturation at 94°C for 1 min, annealing at 35°C for 1 min and extension at 72°C for 1 min, then 35 cycles of denaturation at 94°C for 1 min, annealing at 50°C for 1 min and extension at 72°C for 1 min, and a final extension at 72°C for 10 min.
A total of 20 SRAP primer combinations were tested, of which the 4 that showed reproducible and distinct amplification of the loci were selected. PCR products were electrophoresed on a 2% agarose gel (UltraPure-Invitrogen) for 2-3 h at 90 V, stained with ethidium bromide and photographed under UV light using an Enduro GDS system (Labnet, USA). The SRAP images were transferred to the GelCompar II software. Only clear fragments in the size range of 50 bp to 2050 bp were scored as present. The binary data were exported and verified for further statistical analyses.
Molecular analysis of REMAP:
Eight REMAP primer combinations were selected for further analysis. The REMAP-PCR was performed in 25 µl reaction mixtures containing 8 ng/µl of DNA, 2 µM of each primer and 13 µl of MyTaq HS Mix (Bioline). PCR cycling started with an initial denaturation step at 94°C for 1 min, followed by 35 cycles of denaturation at 95°C for 15 s, annealing at the primer-specific annealing temperature for 15 s and extension at 72°C for 10 min. Gel electrophoresis and band scoring were performed as described for SRAP. Fragments in the size range of 50 bp to 1500 bp were scored as present.
Data analysis: Genotypic data were used to evaluate the efficacy of each primer combination in each marker system. The polymorphism information content (PIC) was determined according to Roldán-Ruiz et al. [20], the marker index (MI) was calculated using the formula given by Powell et al. [21], and the resolving power (Rp) was calculated according to Prevost and Wilkinson [22], based on the distribution of alleles across all 240 genotypes of A. spinosa L.
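As an illustration of these three indices, the following minimal Python sketch computes them from a binary band matrix. It assumes the usual forms of the cited formulas for dominant markers: PIC = 2f(1 - f) per band, with f the band frequency; Rp as the sum of band informativeness Ib = 1 - 2|0.5 - f|; and MI = PIC × EMR, where the effective multiplex ratio EMR is the number of polymorphic bands multiplied by their proportion. The function name, the averaging of PIC over polymorphic bands only, and the toy matrix are assumptions made for this example and are not taken from the study.

```python
import numpy as np

def marker_indices(bands):
    """Summarise one primer combination.

    bands: 2-D array of 0/1 scores, rows = genotypes, columns = bands.
    Returns mean PIC, resolving power (Rp), EMR and marker index (MI).
    """
    bands = np.asarray(bands, dtype=float)
    f = bands.mean(axis=0)                      # frequency of band presence
    polymorphic = (f > 0.0) & (f < 1.0)         # bands that vary among genotypes
    n_total = bands.shape[1]
    n_poly = int(polymorphic.sum())

    pic_per_band = 2.0 * f * (1.0 - f)          # assumed dominant-marker PIC
    pic = float(pic_per_band[polymorphic].mean()) if n_poly else 0.0

    # Band informativeness Ib = 1 - 2|0.5 - f|, summed over all bands.
    rp = float(np.sum(1.0 - 2.0 * np.abs(0.5 - f)))

    emr = n_poly * (n_poly / n_total)           # effective multiplex ratio
    return {"PIC": pic, "Rp": rp, "EMR": emr, "MI": pic * emr}

# Toy matrix (4 genotypes x 3 bands), invented for illustration only.
toy = [[1, 0, 1],
       [1, 1, 0],
       [0, 1, 1],
       [0, 0, 1]]
toy_indices = marker_indices(toy)
```

When every band is polymorphic, as reported here, EMR equals the number of bands, so MI reduces to the mean PIC multiplied by the number of bands per primer combination.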
Genetic diversity and population structure analyses: The combined binary data matrix of SRAP and REMAP markers was used to estimate genetic differentiation (Gst) and gene flow (Nm) using POPGENE v.1.32 [23]. The genetic diversity parameters, namely the percentage of polymorphic loci (%P), Shannon's information index (I), Nei's gene diversity (H), the effective number of alleles (Ne) and the observed number of alleles (Na), were estimated using GenAlEx version 6.5 [24]. The same software was also used to partition the genetic variance among and within populations by analysis of molecular variance (AMOVA) and to ordinate relationships among populations, based on the genetic distance matrix, by principal coordinates analysis (PCoA).
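The relation between Gst and Nm can be sketched as follows, assuming the estimator Nm = 0.5(1 - Gst)/Gst, which reproduces the reported Nm = 1.2391 from Gst = 0.2875. The second function gives per-locus diversity values for a dominant marker under the additional assumption of Hardy-Weinberg equilibrium, with the null-allele frequency taken as the square root of the band-absence frequency. Both function names are illustrative, and the exact computations inside POPGENE and GenAlEx may differ in detail.

```python
import math

def gene_flow_from_gst(gst):
    """Estimate Nm from Gst, assuming Nm = 0.5 * (1 - Gst) / Gst."""
    if not 0.0 < gst <= 1.0:
        raise ValueError("Gst must lie in (0, 1]")
    return 0.5 * (1.0 - gst) / gst

def dominant_locus_diversity(band_freq):
    """Per-locus H, Ne and I for a dominant marker (assumes Hardy-Weinberg).

    band_freq: proportion of individuals showing the band.
    """
    q = math.sqrt(1.0 - band_freq)       # null (band-absent) allele frequency
    p = 1.0 - q                          # band-present allele frequency
    homozygosity = p * p + q * q
    h = 1.0 - homozygosity               # Nei's gene diversity
    ne = 1.0 / homozygosity              # effective number of alleles
    i = -sum(x * math.log(x) for x in (p, q) if x > 0.0)  # Shannon's index
    return {"H": h, "Ne": ne, "I": i}

nm = gene_flow_from_gst(0.2875)          # Gst value reported in this study
```

A value of Nm above 1 is commonly read as gene flow sufficient to limit differentiation by drift alone, which is consistent with most variation lying within populations.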
Population structure was inferred from the REMAP and SRAP genotypes using the Bayesian clustering model implemented in STRUCTURE software version 2.3 [25]. The number of populations (K) was tested from 1 to 15 with a burn-in period of 10,000 iterations followed by 100,000 Markov Chain Monte Carlo (MCMC) iterations. The program was run 20 times for each value of K under the admixture model with correlated allele frequencies. The highest value of ΔK, representing the most likely number of clusters [26], was determined using Structure Harvester [27].
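The ΔK criterion can be sketched in a few lines of Python. This example assumes the common reading of the Evanno statistic, in which the absolute second-order difference of the mean log-likelihood L(K) across replicate runs is divided by the standard deviation of L(K). The function name and the toy log-likelihoods are invented for illustration and do not come from the STRUCTURE runs of this study.

```python
import statistics

def evanno_delta_k(lnp_by_k):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K to a list of L(K) values from replicate runs.
    """
    ks = sorted(lnp_by_k)
    mean = {k: statistics.mean(lnp_by_k[k]) for k in ks}
    sd = {k: statistics.stdev(lnp_by_k[k]) for k in ks}

    delta = {}
    for k in ks:
        # Delta K is undefined at the ends of the tested range of K
        # and when replicate runs show no variance.
        if (k - 1) in mean and (k + 1) in mean and sd[k] > 0.0:
            second_diff = mean[k + 1] - 2.0 * mean[k] + mean[k - 1]
            delta[k] = abs(second_diff) / sd[k]
    return delta

# Toy replicate log-likelihoods (two runs per K), for illustration only.
toy_runs = {1: [-1000.0, -1002.0], 2: [-900.0, -902.0],
            3: [-890.0, -892.0], 4: [-885.0, -887.0]}
toy_delta = evanno_delta_k(toy_runs)
best_k = max(toy_delta, key=toy_delta.get)
```

Because ΔK needs both neighbouring values of K, it cannot be evaluated at K = 1 or at the largest K tested.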
Candidate loci under selection were identified using the Fst-based Bayesian approach described by Beaumont and Balding [28], as implemented in BayeScan software version 2.1 [29]. For dominant marker data, the analysis directly estimates the probability that a locus is under selection by calculating the ratio of the posterior probabilities of the selection model and the neutral model [30]. The default parameters were used: 20 pilot runs, a burn-in period of 50,000 iterations and 5,000 output iterations with a thinning interval of 10. Markers with a posterior probability of 0.79 or higher were considered outliers. To eliminate false positives among the outliers found, the false discovery rate (FDR) was calculated: BayeScan determines a q-value for each locus, which is the FDR analogue of the p-value. The loci with a q-value over 1% were retained as outliers.
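To make the link between posterior probability, posterior odds and q-value concrete, the sketch below uses the standard Bayesian definitions: posterior odds PO = P/(1 - P), and the q-value of a locus as the mean posterior error probability (1 - P) over that locus and all loci ranked above it. Whether BayeScan's internal computation matches this exactly is an assumption, and the function names are illustrative.

```python
import math

def posterior_odds(prob):
    """Posterior odds in favour of the selection model, PO = P / (1 - P)."""
    if not 0.0 <= prob < 1.0:
        raise ValueError("probability must lie in [0, 1)")
    return prob / (1.0 - prob)

def q_values(post_probs):
    """Expected FDR when each locus and all higher-ranked loci are called outliers.

    Ties are not treated specially: tied loci are ranked in input order.
    """
    order = sorted(range(len(post_probs)),
                   key=lambda i: post_probs[i], reverse=True)
    q = [0.0] * len(post_probs)
    running_error = 0.0
    for rank, idx in enumerate(order, start=1):
        running_error += 1.0 - post_probs[idx]   # posterior error probability
        q[idx] = running_error / rank
    return q

# Threshold used in this study, expressed on the log10 posterior-odds scale.
log10_po_threshold = math.log10(posterior_odds(0.79))
```

On this scale, the 0.79 threshold corresponds to log10(PO) of roughly 0.58, which falls in the range usually described as substantial evidence on Jeffreys' scale.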

RESULTS
Two marker systems (SRAP and REMAP) were used in this study to reveal the genetic diversity and outlier markers of 240 argan individuals. Four of the 20 SRAP primer combinations and eight REMAP primer combinations produced polymorphic patterns and were selected for data analysis. The four selected SRAP primer combinations amplified a total of 147 amplicons, ranging from a maximum of 45 (ME-1/EM-4) to a minimum of 32 (ME-2/EM-4), with an average of 36.75 bands and an overall polymorphism of 100%. The polymorphism information content (PIC) averaged 0.20; the highest PIC (0.21) was shown by ME-2/EM-4 and ME-4/EM-4. The mean resolving power (Rp) and marker index (MI) were 10.74 and 7.33, respectively; the highest Rp (12.51) was generated by ME-2/EM-6 and the highest MI (11.53) by ME-1/EM-4. The eight REMAP primer combinations also showed high polymorphism (100%), amplifying 338 polymorphic bands with an average of 42.25. The REMAP primer combinations IRAP3/ISSR2 and IRAP1/ISSR2 gave the lowest (34) and the highest (52) numbers of bands, respectively (Table 2). [...] (Table 3).
The AMOVA revealed that variation within populations (74%) was significantly higher than that among populations (26%). Similarly, the Gst value was 0.2875 and the gene flow was 1.2391 (Table 4).
PCoA and STRUCTURE analyses were used to visualize the relationships among populations and to elucidate the genetic structure of the 24 argan populations. The first three coordinates of the PCoA accounted for 51.49% of the total variation. The first axis explained 10.95% of the variance, while the second and the third explained 18.48% and 22.51% of the variance, respectively (Fig. 1).
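The percentage of variation attributed to each PCoA axis can be illustrated with classical principal coordinates analysis: the squared distance matrix is double-centred and decomposed into eigenvalues, and each positive eigenvalue is expressed as a share of their sum. The sketch below is a generic NumPy version with an illustrative function name. GenAlEx offers its own standardisation options, so its percentages need not match this plain variant.

```python
import numpy as np

def pcoa_explained(dist):
    """Percent of variation per principal coordinate, largest axis first.

    dist: symmetric matrix of pairwise genetic distances.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Gower double-centring of the squared distances.
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals = np.linalg.eigvalsh(b)[::-1]        # eigenvalues in descending order
    positive = eigvals[eigvals > 1e-12]          # drop zero and negative axes
    return 100.0 * positive / positive.sum()

# Toy distances among four points on the corners of a unit square.
s = 2.0 ** 0.5
toy_dist = [[0.0, 1.0, s, 1.0],
            [1.0, 0.0, 1.0, s],
            [s, 1.0, 0.0, 1.0],
            [1.0, s, 1.0, 0.0]]
toy_percent = pcoa_explained(toy_dist)
```

In this toy case the two axes carry equal shares of the variation, since the square has no dominant direction.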
A Bayesian analysis was performed with the K value (possible number of clusters) ranging from 1 to 15. The second-order statistic (ΔK) developed by Evanno et al. [26] showed a clear maximum at K = 2 (ΔK = 233.762571), indicating that the 24 populations could be grouped into two genetic clusters (Fig. 2). The first cluster mainly consisted of 14 populations (2, 3, 4, 5, 6, 7, 8, 13, 14, 15, 16, 17, 18, 19 and 23) and the second cluster mainly consisted of 10 populations (1, 9, 10, 11, 12, 20, 21, 22 and 24). The second largest ΔK, at K = 3, was also much larger than the remaining values. At K = 3, the first cluster was separated into two new genetic clusters. Some populations formed a clear separation, indicating high differentiation.
Fig. 2. [...] [26] plotted against the number of genotype clusters (K), where K = 2 is the best fit of the populations, followed by K = 3. c) STRUCTURE output for K = 2 and d) K = 3. For population numbers see Table 1.

DISCUSSION
The efficiency of genetic diversity analysis depends largely on the polymorphism detected by molecular markers. In this study, SRAP and REMAP markers were used to assess genetic diversity among 240 argan genotypes representing 24 natural ecosystems. These markers are highly polymorphic due to their abundance in plant genomes and their dynamic nature [5,7]. Their efficiency in assessing genetic diversity has been demonstrated in several studies [10,11,13,31,34]. Many other studies also suggest that retrotransposon-based molecular markers are more useful and more informative than other markers such as RAPD, AFLP or SSR [35,37].
Of all the primer combinations tested, 4 SRAP and 8 REMAP combinations produced good polymorphism and scorable banding patterns. These primers were selected to study the genetic diversity of the argan tree.
Although the mean number of polymorphic bands amplified by REMAP was higher than that observed with SRAP, both marker systems revealed a high level of polymorphism (100%). In addition, evaluation of the informativeness and discriminatory power of the two marker systems through the PIC, Rp and MI showed higher mean values for SRAP than for REMAP. This result is comparable to those of other studies suggesting that SRAP is the most efficient technique for revealing informative bands [38,39].
However, the combined use of different approaches could give more reliable information about genetic diversity [40,41]. In fact, the two markers used in this study differ in the genome region they target. The REMAP technique relies on the amplification of the sequence between a microsatellite and a retrotransposon [42], while the SRAP technique detects polymorphisms by amplifying open reading frames (ORFs) [43].
Consequently, the combined data should provide good genome coverage. The results of the two molecular techniques were therefore combined, giving data for 484 loci. Based on these data, the analysis of molecular variance (AMOVA) was used to evaluate the genetic variability in the different populations and revealed that most of the variance was attributable to divergence within populations. This was confirmed by the coefficient of genetic differentiation (Gst = 0.2875). These results are consistent with those of other studies on the genetic diversity of the argan tree using IRAP-ISSR and AFLP markers [17,18]. They can be explained, first, by the naturally fragmented habitat of the argan tree; this distribution leads to a high level of genetic differentiation among argan populations in comparison with other forest trees [44]. Second, the argan tree has a dual pollination system, by wind (anemophily) or by insects (entomophily) [44]. Individuals therefore exchange genes with their nearest neighbours, creating high gene flow between geographically close populations. This result was confirmed by the PCoA, which showed that most populations belonging to the same region grouped together, except the populations of region 4 (Beniznsaen and Oued Grou), which may reflect recent dispersal, probably caused by human activities.
The Bayesian analysis divided the 24 populations into two genetic pools (K = 2). This structure can result from different factors, whether historical, geographic or genetic [47]. Moreover, argan populations in Morocco grow under different climates (sub-humid, arid, semi-arid and Saharan), which can cause variation in the flowering period and consequently prevent gene flow between populations.
The genetic diversity parameters were also calculated from the combined data of the two markers, and the highest values were found in the Mramer population. This population should have priority for conservation. In general, the revealed genetic diversity was higher than the values found in endangered or endemic forest species.
However, the values found in this study were higher than those found by Pakhrou et al. [18] using combined data of IRAP and ISSR markers to assess the genetic diversity of the same populations. This comparison reveals that the indices of genetic diversity can vary according to the techniques used and confirms that retrotransposon-based molecular markers were more useful for detecting genetic variation. Hence, comparing the ability of each marker to assess the genetic diversity of the argan tree seems a promising perspective: it would give a better understanding of their efficiency and utility and could also provide valuable information.
The combined data were also used to detect outlier loci. Although retrotransposon-based molecular markers are assumed to be neutral, they generate a large number of polymorphic loci, some of which may coincide with genes encoding specific proteins. Recently, many dominant markers, such as SRAP and ISSR, have been used to detect outliers [48][49][50]. In the present investigation, BayeScan was selected for its performance in detecting outliers and reducing the number of false positives [51,52], and twenty-one outliers were detected (7 under selection and 14 under balancing selection). Our results are in line with those of other studies that used dominant markers to detect outliers and rejected the null hypothesis of the neutral allele model [45]. The genetic basis found in the present study can be used to investigate the adaptability of the argan tree to climatic variation, especially by analyzing the correlation between climatic variables (temperature and precipitation) and adaptive loci, and to predict the response of the argan tree to future climate change.
The study of genetic diversity is a crucial step towards effective conservation and preservation of the argan forest. This is the first report of a genetic diversity study of Argania spinosa L. using REMAP and SRAP markers. A total of 240 argan trees were collected from 24 regions representing the natural distribution of the species. The results indicate that genetic variation was found mainly within populations and show a clear genetic structure that grouped the 24 populations into two genetic groups. This study demonstrated that SRAP and REMAP markers provide appropriate information on the genetic structure of the argan tree and could be used as an efficient tool for detecting the molecular relationships among trees. The data can be used to study the adaptation mechanisms of the argan tree.